perm filename C.2[CYB,DBL] blob sn#154455 filedate 1975-04-15 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00015 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00003 00002	.DEVICE XGP
C00005 00003	.PORTION MACROS
C00009 00004	.NSECP(Theory)
C00019 00005	.SSEC(How to Build Such a System)
C00023 00006	.NSEC(Application to Automatic Programming)
C00031 00007	.SSEC(Details of the Experimental System)
C00041 00008	.SSEC(Results)
C00047 00009	.SSEC(Conclusions)
C00053 00010	.NSEC(Application to Mathematics Research)
C00060 00011	.SSEC(Characteristics of a Proposed System)
C00087 00012	.SSEC(Projected Results)
C00091 00013	.SSEC(Synthesis)
C00093 00014	.ASEC(References)
C00098 00015	.EVERY HEADING(,,)
C00100 ENDMK
C⊗;
.DEVICE XGP
.!XGPCOMMANDS←"/TMAR=30/PMAR=2150/BMAR=20"

.FONT 1 "BASL30"
.FONT 2 "BASB30"
.FONT 4  "BASI30"
.FONT 5  "NGR40"
.FONT 6  "NGR25"
.FONT 7  "NGR20"
.FONT 8  "GRFX35"
.FONT 9  "FIX20"
.TURN ON "↑α↓_π[]{"
.TURN ON "⊗" FOR "%"
.TURN ON "@" FOR "%"
.PAGE FRAME 63 HIGH 89 WIDE
.TITLE AREA HEADING LINES 1 TO 2
.AREA TEXT LINES 4 TO 62  CHARS 1 TO 89
.NARROW 6,8
.COUNT PAGE PRINTING "1"
.TABBREAK
.!XGPLFTMAR←133
.AT "ffi" ⊂ IF THISFONT ≤ 4 THEN "≠"  ELSE "fαfαi" ⊃;
.AT "ffl" ⊂ IF THISFONT ≤ 4 THEN "α∞" ELSE "fαfαl" ⊃;
.AT "ff"  ⊂ IF THISFONT ≤ 4 THEN "≥"  ELSE "fαf" ⊃;
.AT "fi"  ⊂ IF THISFONT ≤ 4 THEN "α≡" ELSE "fαi" ⊃;
.AT "fl"  ⊂ IF THISFONT ≤ 4 THEN "∨"  ELSE "fαl" ⊃;
.PORTION MACROS
.SELECT 1
.MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
.MACRO BB ⊂ BEGIN NOFILL SELECT 6 INDENT 0 GROUP ⊃
.MACRO E ⊂ APART END ⊃
.MACRO D ⊂ ONCE PREFACE 100 MILLS ⊃
.MACRO BBB ⊂ BEGIN  SINGLE SPACE; PREFACE 1; NOFILL GROUP ⊃
.MACRO BB2 ⊂ BEGIN  SINGLE SPACE; FILL ⊃
.MACRO B4 ⊂ BEGIN NOFILL INDENT 0  SELECT 2  WIDEN 6,8  PREFACE 10 MILLS ⊃
.MACRO B5 ⊂ BEGIN FILL INDENT 0  SELECT 7  PREFACE 5 MILLS  SPACING 0 MILLS ⊃
.MACRO FAD ⊂FILL ADJUST COMPACT DOUBLE SPACE; PREFACE 2 ⊃
.FAD

.AT "$$" ENTRY "*" ⊂ XGENLINES←XGENLINES-1; "↑*↑*" ;
.SEND FOOT ⊂ TURN ON "{" SELECT 7; SPACING 0; PREFACE 0; INDENT 0,10
↑*↑* ENTRY
.BREAK ⊃ ⊃
.FOOTSEP←"****************************************************************************************"

.MACRO NSECP(A)  ⊂   SSECNUM←0
.SECNUM←SECNUM+1 
.SECTION←"A"
.SKIP TO COLUMN 1
.TURN ON "{∞→"   
.SEND CONTENTS ⊂
@1{SECNUM}. A⊗* ∞.→ {PAGE}
.⊃
.TURN OFF "{∞→"   
.ONCE CENTER TURN ON "{}"
@5↓_{SECNUM}. A_↓⊗*  
.  ⊃


.MACRO ASEC(A)  ⊂  SSECNUM←0
.SECTION←"A"
.SKIP TO COLUMN 1
.TURN ON "{∞→"   
.SEND CONTENTS ⊂
@1 A⊗*
.⊃
.TURN OFF "{∞→"   
.ONCE CENTER TURN ON "{}"
@5↓_A_↓⊗*  
.  ⊃


.MACRO NSEC(A)  ⊂  
.SSECNUM←0
.SECNUM←SECNUM+1 
.TURN ON "{∞→"   
.SEND CONTENTS ⊂
@1{SECNUM}. A⊗* ∞.→ {PAGE}
.⊃
.TURN OFF "{∞→"   
.SECTION←"A"
.GROUP SKIP 3
.ONCE CENTER TURN ON "{}"
@5↓_{SECNUM}. A_↓⊗*  
.  ⊃


.MACRO SSEC(A)  ⊂  TURN ON "{∞→"   
.SSECNUM←SSECNUM+1
.SSSECNUM←0
.SEND CONTENTS ⊂
@7               A⊗* ∞.→ {PAGE}
.⊃
.TURN OFF "{∞→"   
.ONCE INDENT 6 TURN ON "{}"
@2↓_{SECNUM}.{SSECNUM}. A_↓⊗*  
. ⊃

.MACRO ASSEC(A)  ⊂  TURN ON "{∞→"   
.SEND CONTENTS ⊂
@7               A⊗*
.⊃
.TURN OFF "{∞→"   
.ONCE INDENT 6 TURN ON "{}"
@2↓_A_↓⊗*  
. ⊃

.MACRO SSSEC(A)  ⊂  TURN ON "{∞→"   
.SSSECNUM←SSSECNUM+1
.TURN OFF "{∞→"   
.ONCE INDENT 1 TURN ON "{}"
@2↓_{SECNUM}.{SSECNUM}.{SSSECNUM}. A_↓⊗*  
. ⊃

.SECNUM←0
.PAGE←0
.NEXT PAGE
.TURN OFF "{"
.INDENT 0
.SELECT 1
.INSERT CONTENTS
.PAGE←0
.PORTION THESIS
.TURN OFF "{∞→}"   
.EVERY HEADING(⊗7Douglas B. Lenat,Interacting Knowledge Modules,{DATE}      page {PAGE}⊗*)
.PAGE←1
.NSECP(Theory)

.SSEC(Origins of the Constraints)

Despite the awesome physical forces affecting Earth (weather, gravitation,... ),
it is ⊗4biological⊗* entities
who dominate our planet.  We shall assume that life "works",  
extract some  obvious characteristics of
living organisms (especially of Man), and then design a program which incorporates
them.

Perhaps the most obvious distinction between, e.g., Weather and Animals, is that
of ⊗4discreteness⊗*. Yet besides  having distinguishable boundaries, each
species has a specific ⊗4anatomy⊗*. That is, each such creature is alike in
many important functional, physiological, structural, and psychological respects.
More than uniformity is required:
⊗4Individual abilities⊗* must be recognized. Despite their similarity,
each member of a species is slightly different in his perception of
features of his environment, in his responses to those features, 
in his drives, and in his
capabilites. In many species, including humans, this has been carried to the extreme
of ⊗4professions.⊗*  A collection
of diverse experts is more viable than a similar collection of generalists.

So we limit our attention to  a community of discrete entities, 
sensitive to the world around them, each with his own distinct
expertise, yet all anatomically and psychologically similar.
How do they ⊗4interact⊗* in a meaningful, productive manner? 
Biology would lead us to hypothesize ⊗4competition⊗*, but 
perhaps the key is
⊗4cooperation.⊗*  Specialization is of advantage only if each specialist is willing
to use his talent to serve the entire community. In part this may be instinctive,
but much of Man's supremacy stems from learning to cooperate. The institutions
of the law, education, and government universally reflect (distortions of) this
ideal. 
Advancement ⊗4within⊗* a society may be by competition, 
but advancement ⊗4of⊗* a society
necessitiates synergy.

But how do people help each other; what is the "mechanism"? 
Perhaps it lies in ⊗4communication⊗*. This has three aspects: 
(i) broadcasting your own needs to the specialists who might help you, or to the
entire community; (ii) recognizing
when you are relevant to satisfying someone else's specific need; (iii) satisfying
that need, often by posing new sub-requests to the community.
Besides communication, another constructive process of living 
entities is ⊗4reproduction.⊗*
Heredity and teaching ensure that the offspring will be 
somewhat similar to his parents, 
yet not identical.

These characteristics are now ⊗4postulated⊗* as constraints on the computer
simulation we wish to construct. The program will be made up of discrete modules,
which we shall call BEINGs, each possessing a clump of specialized knowledge.
Each BEING shall have the same internal structure, can respond to the same kinds
of questions (though with different answers in each case). The only things that
a BEING can do are: broadcast a plea for assistance, recognize his own ability to
help in a certain situation, provide the needed answer in such cases, combine
with  0,1, or 2 other BEINGs to produce a new BEING, and teach that offspring
some of what it knows.

The reader may protest that much has been ignored: primitive drives, 
instincts, emotions,
aesthetics, ethics,...  Some of these will be dealt with in Section 3, when we
discuss design issues for a system which can make simple mathematical discoveries.
Such a system must have a sense of aesthetic beauty, harmony, 
potential utility, interestingness,
a sense of intuition, a priority queue of instincts.
It is not felt that the complexity of human emotions is 
prerequisite to intelligence;
BEINGs need not simulate emotions.  
No case shall be argued in favor of these
prejudices; rather, we shall test them, observing what happens
in a system which assumes them. This is the subject of Section 2. Before that
discussion, let us formally define some of our terms.


A ⊗4system⊗* of BEINGs, also called a ⊗4pool⊗*,
is a collection of knowledge modules, plus
auxilliary support functions (like "Pattern-match", "Sort"). Each
⊗4BEING⊗* is nothing more than a collection of about thirty parts.
Each ⊗4part⊗* of BEING B is an ordered pair ⊗2(q,a)⊗*, where
⊗2↓_q_↓⊗* is an abbreviated name of a question, and ⊗2↓_a_↓⊗* is a
program which can be run to provide B's answer to ⊗2↓_q_↓⊗*. In the
course of executing ⊗2↓_a_↓⊗*, new questions may be raised, new
BEINGs may be created, parts of existing BEINGs may be changed, and
messages (hopefully including the desired answer) may be passed. 

Once again: the concept of a pool of BEINGs is that many specialized 
entities coexist, each
having a complex structure, but that structure does not vary from entity to entity.
This idea has analogues in many fields: transactional analysis in
psychology, anatomy in medicine, modular design in architechture.

It seems probable 
that the set ⊗2Q = {q}⊗* might need to be immense, to cover all the kinds of
queries any expert might ever want to put to another expert. In fact,
in order to make ||⊗2Q⊗*|| manageably small, it is necessary to restrict the
task domain of the BEINGs system. For example, a set Q↓c↓f of
29 parts can be found which suffice to discuss concept formation, and a
set Q↓[math] of 25 which suffice to discuss mathematical research, but those two
sets are almost disjoint.

.SSEC(How to Build Such a System)

How can we test out 
the feasability of this design scheme for representing knowledge?
We must build a pool of BEINGs, a modular program
which will interact with a human user and 
do some specific task (like write a certain computer program, or propose
some specific mathematical conjecture and then find a proof for it).
Recasting the idea into operational terms, we arrive at this procedure for
constructing a pool of BEINGs: 

.BEGIN INDENT 0,4,0 SKIP 1 SINGLE SPACE PREFACE 1

(1) Study the task which the pool is to do. See
what kinds of questions are asked by (simulated) experts, as they cooperate to
carry out that task.

(2) Distill this into
a core of simple questions, ⊗2Q⊗*,
in such a way that each inter-expert question or transfer
of control can be rephrased in terms of some ⊗2↓_q_↓ ε Q⊗*.
The size of ⊗2Q⊗* is very important.
If Q is too large, addition of new
BEINGs will demand either great effort or great intelligence (an example of a
system like this is ACTORS↑1). If Q is too small, all the non-uniformity is simply
pushed down into the values of one or two general 
catchall questions (all first-order
logical languages do this). 

(3) List all the BEINGs who will be
present in the pool, and fill in their parts. 
The time and effort required to fill in a new BEING is  ⊗4independent⊗* of
the number of BEINGs already in the pool, because BEINGs can communicate
via nondeterministic goal-directed  mechanisms, and do 
not have to know the names of the BEINGs
who will answer their queries. 

(4) The human user interacts with the completed
BEING community, until the desired task is complete.

.END

In the next section, we apply this recipe to the task ⊗4"Write a concept formation
program in LISP, similar to the one described by Winston↑2".⊗*
The problems encountered will guide us to find a new domain (in Section 3) 
and change the formal
requirement that each BEING's set of questions be identical.

.NSEC(Application to Automatic Programming)

The system's task domain is chosen to be Automatic Programming; in particular,
we want a BEINGs system which can interact with a human user and synthesize a
concept formation↑3 program.  
By ⊗4Automatic Programming,⊗* we mean this interactive, nonformal process whereby
a human ⊗4user⊗* specifies a desired program to a ⊗4system⊗*, in a tiny subset of
some natural language, after which the system and user discuss details about the
task, until the desired program, called the
⊗4Target Program⊗*,
is finally synthesized.↑4
Any BEINGs which contain knowledge about concept formation will be called 
⊗4task-specific⊗*; those possessing program-writing knowledge are called
⊗4domain-specific⊗*; the most general BEINGs are ⊗4domain-independent⊗*.


.SSEC(Analogy: Cooperating Human Experts)

We now carry out the four-step procedure outlined
near the end of the last section.
The first step is to examine a community of expert organisms performing the 
program-writing task.
Let us imagine a room filled with humans from a variety of professions:
scientific programmers, psychologists, system hackers, engineers, troubleshooters,
logical theorists, managers, etc. There is one distinguished member: we may consider
him to be the director, sponsor, or ultimate User of the final product.
the user is the one who initially specifies what is wanted. He 
can decide matters of policy, but
the experts are loathe to bother him unnecessarily.
The experts cooperate by asking each other questions, helping answer whatever they
can, and noticing when they might be able to assist. Some of them can write
or modify parts of the partially-synthesized target program; ⊗4all⊗* of them can direct
programmers to do specific encodings.

A hypothetical dialogue was simulated, in which the concept formation program
was discussed and synthesized. This eventually called for 80 different experts,
asking about 10,000 questions altogether, 
all of which fell nicely into 29 categories
of questions. We constructed a BEING to model each particular human expert.

So our experimental system, called PUP6, must consist of a community of 80
BEINGs (and several minor utility functions). Each BEING has 29 parts.
For each part of each BEING,
its ⊗2↓_q_↓⊗* subpart is simply a LISP identifier, an atom; the
⊗2↓_a_↓⊗* subpart is a LISP program, ranging from 1 to 60 lines in
length. 

.TURN OFF "{}"

The concept of anatomy is realized by constraining that each BEING in the PUP6
system have the same set ⊗2Q = {q↓1,...,q↓2↓9}⊗* of questions which it can answer.
Of course the values ⊗2↓_a_↓⊗* are different from individual to
individual.$$ For example, each BEING must have an EFFECTS part, that
is, a part whose question is abbreviated "↓_EFFECTS_↓". This part really answers
"What final effects can you produce; can you acheive this particular
goal...?" The value filled in for the EFFECTS part of a BEING B is simply
a set of ↓_<situation>/<action>_↓ productions: if the desired effect matches
↓_<situation>_↓, 
then it can be achieved by B, by executing the program ↓_<action>_↓. *

Analyzing the simulated experts' discussion, there seems to be a single dominant
scenario for creation of new code. 
They work together to simulate the target concept formation program as it should
run.
When the concept formation program has to carry
out some task X, some expert will recognize his relevance. He will then carry out
X himself, introspect, extract the particular knowledge he is using to do X, and
tell a programmer exactly how he is doing X.↑5  The programmer then oversees the
encoding of this and the merging into the already-synthesized pieces of target code.

In PUP6, new BEINGs are created analogously.
The typical act of BEING creation is when a single, general BEING, say
B, recognizes that he can do the desired target task, but also knows that
he is too general and inefficient to be duplicated precisely inside the
target program. Then B 
introspects, and, together with a programmer-BEING, assembles a
new, specialized, streamlined version of
itself, B', which can do the required task but not much else.
Other BEINGs  try to fill in or extend all the parts of B', so B' can answer
any of the 29 possible questions ⊗2Q⊗*.

The process of filling in all 29 parts of all 80 BEINGs was fairly routine. For
part p of BEING B, we simply looked through the simulated dialogue, collected all
the instances in which expert B answered a question in category p, then wrote a small
collection of facts and procedures which (i) embodied the knowledge that seemed to 
be required to produce these responses, (ii) ensured that in each of these situations,
the response given would be the same as the one in the dialogue. The first point
ensures we are not "cheating", the second point guarantees that the whole dialogue
can be reproduced by the BEINGs system, if the human user of PUP6 
makes exactly the same
requests, comments, and replies as the hypothetical user in the dialgoue.

.SSEC(Details of the Experimental System)

.ONCE TURN ON "[]"
Because of its genesis from a particular protocol dialogue↑6, it is not surprising
that the PUP6 system actually did manage to synthesize the target program. We shall
refer to that concept formation program↑2 as CF. By adding a few more BEINGs to the
system, the user was able to get PUP6 to write a grammatical inference program
(GI) and a simple property-list maintenance program (PL). Despite these promising
results, dialogue flexibility problems detracted greatly from PUP6.
For a detailed treatment of PUP6, see ↑[4,7].


A set of 29 ubiquitous questions were chosen, representing everything one
expert might want to ask another. 
At least, they naturally encompass those questions which were asked during the
simulated meeting, hence should be sufficient for generating CF.
⊗2Q↓c↓f⊗*, this
universal set of BEING parts, is listed in the box below.
One potential flaw of a universal set  of parts is that  each BEING might really
only want to answer a few of the questions; many of them should not be asked of
him. In fact, this superfluity was found to exist. It was not tolerable, and led
to the Families concept, which will be introduced in section 3.
Each part was needed by only about a third of all the BEINGs in the PUP6 system,
in order for PUP6 to be able to generate CF properly.
The percentage of the BEINGs in PUP6
which actually needed each part is also noted below.
.XGENLINES←XGENLINES-3
.BEGIN NOFILL GROUP  PREFACE 0 MILLS  WIDEN 0,3  TURN ON "→∞"  SELECT 8
.ONCE TURN OFF "α"
⊂∞α→⊃
.SELECT 6
⊗8~⊗*⊗2WHAT⊗* 82%  A brief summary of the global purpose, a template for an English phrase.→⊗8~⊗*
⊗8~⊗*⊗2WHY⊗* 77%  A justification of the BEING's existence, partly filled in by the BEING who called it.→⊗8~⊗*
⊗8~⊗*⊗2HOW⊗* 72%  A summary of the methods the BEING intends to employ. Global strategies.→⊗8~⊗*
⊗8~⊗*⊗2IDEN⊗* 54%  How this BEING is referenced in English phrases. Implemented as productions.→⊗8~⊗*
⊗8~⊗*⊗2WHEN⊗* 19%  Why this BEING should be given control now. The sum of weighted factors.→⊗8~⊗*
⊗8~⊗*⊗2ARGS⊗* 63%  Number of arguments; which are optional; names of any local variables.→⊗8~⊗*
⊗8~⊗*⊗2ARG-CHECK⊗* 81%  Predicates which examine each argument for suitability.→⊗8~⊗*
⊗8~⊗*⊗2EVAL-ARGS⊗*  4%  Which arguments are to be evaluated, and which quoted.→⊗8~⊗*
⊗8~⊗*⊗2REQUISITES⊗* 10%  If this BEING is actually chosen, what must be made true prior to (pre)→⊗8~⊗*
⊗8~⊗*		during (co), and after (post) execution.  Work to make these true (unlike ARG-CHECK).→⊗8~⊗*
⊗8~⊗*⊗2DEMONS⊗* 7%  Set of demons to be kept active while the BEING is in control.→⊗8~⊗*
⊗8~⊗*⊗2INHIBIT-DEMONS⊗*  5%  A lock/unlock mechanism, useful when handling demonic interrupts.→⊗8~⊗*
⊗8~⊗*⊗2EFFECTS⊗* 27%  What will probably be true after this BEING is through. What it achieves.→⊗8~⊗*
⊗8~⊗*⊗2RESULTS⊗* 15%  How many values this returns. What domain each is in. Side effects.→⊗8~⊗*
⊗8~⊗*⊗2META-CODE⊗* 70%  The body of the executable code, but with uninstantiated subparts.→⊗8~⊗*
⊗8~⊗*⊗2COMMENTS⊗* 16%  Hints for filling in undefined sections of other BEING parts.→⊗8~⊗*
⊗8~⊗*⊗2STRUCTURE⊗* 4% Viewing this BEING as a data structure, the operations doable to it.→⊗8~⊗*
⊗8~⊗*⊗2AFFECTS⊗* 14%  Other BEINGs which might be called by this BEING, and why.→⊗8~⊗*
⊗8~⊗*⊗2COMPLEXITY⊗* 92%  A vector of utility measures, including probable ease of calling,→⊗8~⊗*
⊗8~⊗*		of success, of (calling)* itself, time/space cost, efficiency of its output results.→⊗8~⊗*
⊗8~⊗*⊗2GENERALIZATIONS⊗* 27%  Some other BEINGs, encompassing this one.→⊗8~⊗*
⊗8~⊗*⊗2ALTERNATIVES⊗* 16%  Some equivalent BEINGs (to try if this one fails).→⊗8~⊗*
⊗8~⊗*⊗2SPECIALIZATIONS⊗* 40%  How to write a streamlined, special-case version of this BEING.→⊗8~⊗*
⊗8~⊗*⊗2ENCODABLE⊗* 9%  How to order the evaluation of the other parts for self-streamlining.→⊗8~⊗*
.SELECT 8
.ONCE TURN OFF "α"
%∞α→$
.TURN OFF "∞→"
.END
.XGENLINES←XGENLINES-2
Each of the 100 BEINGs ultimately present in PUP6 should have
had a value for each part (in reality, only 40% of these 2900 slots were filled in;
only 75% of ⊗4these⊗* were actually used in generating one of the target programs).
The set of parts breaks into three rough categories: those parts which
are useful in deciding which BEING gets control, those useful primarily in
answering the user's questions and keeping him oriented, and those useful when
the BEING gains control.

.BEGIN SINGLE SPACE PREFACE 1

↓_CONTROL:_↓
At the humans' meeting, only one expert spoke at a time; in the BEINGs
community, only one BEING has control at any given moment. He uses his
parts to do things (ask, create, modify), and yields control either
voluntarily or through interruption. 
The parts useful here include
⊗6ARGS, DEMONS, META-CODE, COMMENTS, ARG-CHECK, and REQUISITES.⊗*

↓_ORIENTATION:_↓
In the simulated dialogue, the user had no trouble
whatever understanding what the  experts asked him. In the actual
programmed PUP6 system, the human who was sitting at the teletype quite
⊗4rarely⊗* understood what was wanted by the BEINGs. He frequently had to
interrupt them and ask them questions about who was in control, why, what
he was trying to do, what had recently transpired, etc. These ideally can
be phrased as simple retrievals and EVALs of active BEINGs' parts.
The BEING parts
most often called for by the user are the simple one-line "orientation"
templates. These include WHAT, HOW, WHY, and AFFECTS.  Since BEINGs can only create
new BEINGs (not simple LISP functions),
the synthesized program, CF, was writen as a pool of BEINGs itself
(by PUP6, but not during the original protocol). 
Although its question-answering ability is inferior to PUP6, the fact that CF
had ⊗4any⊗* such power was surprising to the author. In other words,
one can interrupt the synthesized program as it is running
and ask questions. Any BEING on the control stack will provide fully instantiated
answers to any of its 29 allowable queries (its parts); all other BEINGs will provide only
hypothetical answers. 

.END

.SSEC(Results)

Four of the most significant questions for automatic programming systems operating
informally via dialogue are (i)
what kinds of things the user must tell the
system, 
(ii) how it manages to carry on that dialogue,
(iii) what programs are synthesized,
and (iv) how much knowledge is used in more than one synthesis?
The first two questions are answered by recalling that the PUP6
BEINGs are ⊗4by construction⊗* able to
reproduce the specific experts' protocol which resulted in CF. The mechanism is
simply questioning and answering, with new BEINGs generated as side effects of
satisfying some BEINGs' requests. The range of programs synthesized includes CF,
GI (a simple grammatical inference program), and PL (an even simpler property list
manipulation program).  The final question can be answered easily in a BEINGs
system: in the case of PUP6, 39 BEINGs were used in synthesizing ⊗4all three⊗*
target programs; there were about 100 BEINGs in total.

.SSSEC(Character of the Dialogue between User and PUP6)

By lumping all the parts of the PUP6 system together conceptually, the
interaction is seen as a ⊗4dialogue⊗* between a human user and an automatic
programming system.  We present below the initial few lines of the actual
300-page long dialogue which resulted in CF being generated. This is shown as
the human user of PUP6 might see it.
.XGENLINES←XGENLINES-1
.BEGIN NOFILL GROUP  SELECT 8   PREFACE 0 MILLS  TURN ON "→∞"
.D1PAGE←PAGE

.ONCE TURN OFF "α"
⊂∞α→⊃
.SELECT 2
⊗8~⊗* PUP: I want any task.→⊗8~⊗*
⊗8~⊗* ⊗4USER: Write a program which does concept formation.⊗*→⊗8~⊗*
⊗8~⊗* PUP: I am about to write a program for ⊗6specialized CONCEPT-FORMATION⊗*.→⊗8~⊗*
⊗8~⊗* PUP: Give me a proper name for this program. (I suggest CF or CONCEPT)→⊗8~⊗*
⊗8~⊗* ⊗4USER: CF.⊗*→⊗8~⊗*
⊗8~⊗* PUP: I am temporarily replacing the task "⊗6specialized form of CONCEPT-FORMATION⊗*"→⊗8~⊗*
⊗8~⊗*		by "⊗6specialized form of PARTITION-A-DOMAIN⊗*".→⊗8~⊗*
⊗8~⊗* PUP: Must CF mimic different human subjects? (respond YES, NO, ?, or DETAILS)→⊗8~⊗*
⊗8~⊗* ⊗4USER: NO.⊗*→⊗8~⊗*
.SELECT 8
.ONCE TURN OFF "α"
%∞α→$
.END
From the above, one gets the true impression that the dialogue was fairly tedious:
choosing names and resolving design decisions ⊗4was⊗* 
the user's main role. There were,
in sum, about 300 such responses involved in generating CF. 

The system was specified by collecting knowledge from one single protocol;
as a result, each BEING had only those
parts specified which it actually would need in the ensuing dialogue.
Part of the difficulty with new dialogues stemmed from this minimal completion.
In the protocol, when a decision was made by experts,
the knowledge necessary to follow the
⊗4other⊗* alternative branch was not used by the expert at that time, 
nor were such superfluous
facts supplied to the BEINGs in PUP6. Thus
the user of PUP6 must almost always resolve each choice the way the simulated
(protocol) user did, or else he will draw on knowledge which the experts had
(but didn't use before) but which PUP6 lacks.
It is felt that if all the parts of all the BEINGs had been faithfully filled
in, this problem would have subsided. Basically, the difficulty is one of
modelling all the possibly relevant knowledge an expert has, rather than
(as was done) just capturing enough of his knowledge to do a few given tasks.
.SSEC(Conclusions)

Some of the problems encountered in PUP6 were due to incomplete
filling in of BEINGs, poor choice of ⊗2Q⊗*, absence of needed BEINGs
(like "DEBUG"), overly simple translation mechanisms, limited channels
for communication, the need for every BEING to conform to a single 
set Q of questions, the creation of a system designed to do just one particular
task. 

Some of the difficulties stem from the nature of the task. In any long dialogue,
the user often forgets, changes his mind, errs, etc. A very sophisticated user
model would be necessary to accomodate this errorful process in a
non-debugging system. 
Without such abilities, the system itself may be led into error.
While most bugs ⊗4are⊗* avoidable by careful
record-keeping, it proved unrealistic to make no provision for debugging a
new thirty-page program. When a few errors did occur, PUP6 itself had
to be altered.  


The performance of the BEINGs representation itself in PUP6 is mixed.
Two advantages were hoped for by using a uniform set of BEING parts.
Addition of new BEINGs to the pool was not easy (for untrained users)
but communication among
BEINGs ⊗4was⊗* easy (fast, natural). Two
advantages were hoped for by keeping the BEINGs highly structured.
The interactions (especially with the user) were
brittle, but
the complex tasks put to the pool ⊗4were⊗* successfully completed.

The crippling problems are seen to be with user-system communication,
not with the ideas dicussed in Section 1.
Sophisticated, bug-free programs ⊗4were⊗* generated, after hours of fairly high
level dialogue with an active user, after tens of thousands of messages passed
among the BEINGs.
Part of this success is attributed to distributing
the responsibility for writing code and for recognizing relevance, to a hundred
entities, rather than having a few central monitors worry about everything.
The standardization of parts made filling in the BEINGs' contents fairly painless.

To continue our research, we must
find a task where BEINGs are well-suited, where
the problems encountered in PUP6 won't recur. What ⊗4are⊗* BEINGs good for?
The idea of a fixed set of parts (which distinguishes them from ACTORs↑1) is
useful if the mass of knowledge is 
too huge for one individual to keep "on top" of.
It then should be organized in a
very uniform way (to simplify preparing it for storage), 
yet it must also be highly structured
(to speed up retrieval). 

The new domain should be one in which the problems 
centarl to our research are isolated
from the staggering complexities of natural
language handling. For these reasons, we next consider
automated research in the field of elementary number theory, 
as a potential task domain.
BEINGs are big and slow, but valuable for organizing knowledge in ways 
meaningful to how it will be used. In 
the system described in the next section, BEINGs will be one
-- but not the only -- 
internal mechanism for representing and manipulating knowledge.

.NSEC(Application to Mathematics Research)

.SSEC(Correcting difficulties of the Automatic Programming system)

The most devastating criticism of BEINGs was that, in PUP6, each BEING wanted only
about 10 of the 29 possible parts filled in; that is, each BEING part was
really needed only by a third of all the BEINGs,
in order for PUP6 to synthesize the concept formation program successfully.
Analyzing this problem leads us
back to the cooperating experts analogy. There, we find the additional structural
level of ⊗4professions⊗*.
Experts within a given profession can communicate
some specialized questions and answers which would be unintelligible to 
outsiders.$$ For example, the hackers' jargon.
Thus, our group of experts was not really uniform, but rather structured along
occupational lines. *
Of course, many of the questions are found in more than one profession; a few are
really universal.

By changing the domain of the BEINGs' activities to elementary mathematics↑1↑0, 
the natural communication problems 
mentioned in the last section
become manageable.  A small core of primitive constructions must be mastered,
but after that, 
it is the user's responsibility to define
any additional phrases. This is true
in "real" mathematical research, and makes translation fairly routine.
A second advantage of this task domain is that  there is not any one specific
task which the system is expected to achieve.
A ⊗4solution⊗* to this task would mean successfully
accounting for the ⊗4driving⊗* and the ⊗4pruning⊗* forces which
result in interesting
mathematical theories being developed. Success
could be measured in operational terms, by applying these forces to
various domains of mathematics, and comparing the results to what is
already known in those fields.

Let us consider, then, the task of doing mathematical research: defining 
potentially interesting mathematical systems, and developing theorems about
them. Each BEING will represent a conceptualization: a mathematical
construction, an activity, a meta-concept, etc.  The BEINGs will be grouped
into a few families or professions; 
each family f has its own distinctive set ⊗2Q↓f⊗*
of questions which its members can answer.

.SSEC(Characteristics of a Proposed System)

.BEGIN SINGLE SPACE PREFACE 1

Before detailing such a system, let us try to explain the motivations and the
mechanisms by which mathematicians operate.↑1↑1

.BEGIN INDENT 0,4,0

(i) The driving and pruning forces are (in decreasing order of importance) 
aesthetics/interestingness,
intuition, utility, analogy, inductive inference (empirical), 
and deductive inference (formal).

(ii) Each of these forces is useful both in generating new conjectures, and
in assessing their acceptability.

(iii) If the essence of these ideas can be factored out into an explicit set
of BEINGs and utility functions, then they can be used to
develop almost any branch of mathematics, at almost any level.

(iv) Each mathematical concept should be represented in several ways, 
including declarative, operational, 
demonic (recognizing its relevance),
exemplary (especially boundary
examples), and intuitive.

(v) A wide foundation of intuition, spanning several mathematical and 
real-world concepts, is prerequisite to sophisticated behavior in ⊗4any⊗*
branch of mathematics.  

.END

Now consider the characteristics of
a man-machine system which could be used  experimentally for this task.
The initial knowledge in the system will consist of (i) specific facts about
mathematics, reasoning, programming, and communication, (ii) strategies
for filling out parts of incomplete BEINGs, (iii) opaque functions
which simulate parts of Nature, and (iv) opaque judgment criteria
for aesthetics, interest, utility, complexity, etc.
The specific facts are
organized into 4 families of Beings; each family initially has about 35 BEINGs, and
each BEING has about 25 parts. The families are: 
Static (eg, sets), Active (eg, relation),
Static-Meta (eg, analogy), and Active-Meta (eg, prove).
For uniformity, the strategies form a fifth family of BEINGs, called Archetypical BEINGs.
A strategy BEING is simply a collection of facts for 
dealing with a particular
type of BEING part (e.g., the "Examples" 
BEING contains suggestions for filling in the
Examples part of any BEING).

The intuition functions represent the environment and are ⊗4opaque⊗*. 
For example, a model of a seesaw might exist, and the
system could play around at varying the weights on each side and their distance from
the fulcrum, and the seesaw function would explain which side sank and how fast.
This might be useful in getting an intuition about multiplication, substitution,
or symmetry.
	The quantity of this corpus appears large (about 3000 BEING-parts to
encode, each as a little LISP program), and it is of some interest to hope that
the very same techniques which lead to discovering new mathematical knowledge
later on might be able to "grow" this knowledge base from a much smaller core --
say a collection of 100 BEINGs with only a few parts filled in for each.
The first activity of such a system, then, would be ⊗4contemplative⊗*: the
interaction with the user would be minimal.

The user is considered slow and dangerously contradicatory, hence not a good channel
to obtain data in general.
But as the known information swells, the need for guidance also grows. 
At some point the
system may simply be swamped by a multitude of equally-mediocre alternatives to
investigate. 
It will then (abeit reluctantly) request  direction from
a human user, in what is to be an ⊗4assimilative⊗* phase. 
These teachings should be the
core definitions of a specific field, and of course should be based on what is 
already mastered. The first experiences could be in set theory, Boolean algebra,
abstract algebra, logic, or
arithmetic.  This will probably be the level finally attained by the actual system.

One higher mode of interaction is conceivable: that of a colleague in research.
In conjunction with a
human adviser, the system would propose and explore interesting new relationships,
decide which creations to name, explore the intuitive meanings of statements, etc.
Hopefully, the reader has balked, complaining that this sounds just like the earlier
phases. In fact, the system will not ring a bell and suddenly switch its activities;
it has no way of knowing that its discovery of PLUS is not new to Mankind.  
The driving and pruning forces in all phases are the same: use 
aesthetics and utility judgments to fill out parts of incomplete BEINGs.
If the guidance of the human turns out to be
important, however, then it will come as no surprise if
the flavor of the interactions changes as the system enters a realm unfamiliar
to the user.

The overall control flow would be a series of 
⊗4"Work on part p of BEING B"⊗* directives.
The driving/pruning forces would each time select, as the next (p,B) pair,
the one which then seemed most promising.
During the course of such
completions, new BEINGs might be called for (split off rich 
subparts of part of an already-exisiting BEING).  


The basic mechanism is  the filling in and the running of parts of BEINGs.
But BEING parts are generally procedural knowledge, so this task really means
automatic code synthesis. Knowledge is stored with an idea toward future usage,
both in where it is placed and how it is recorded.
This is the logical continuation of the usage-oriented storage originating in
PUP1↑4 and developed into BEINGs in PUP6 (Section 2 of this paper).


For automatic programming, ⊗2Q↓c↓f⊗* divided into three groups of questions:
recognizing relevance, exerting control, and general orientation information.
For math theory formation, a set ⊗2Q↓[math]⊗* of 25 questions has been formulated,
divided into four groups: ⊗2recognizing relevance⊗* 
⊗7(Changes, Final, Past, Iden)⊗*,
⊗2altering oneself⊗* ⊗7(Generalizations, 
Specializations, Boundary, Domain/Range, Ordering,
Worth, Interestingness, Justification, Operations/Properties)⊗*, 
⊗2acting upon another BEING⊗*
⊗7(Boundary-operations, Fillin, Structure, Algorithms, Check, Representation,
Views)⊗*, and ⊗2general information⊗* ⊗7(Definition, 
Intuition, Ties to related BEINGs,
Examples, Contents)⊗*.  Each family or profession has its own distinctive
subset of these parts which all its members must relate to. For example,
BEINGs representing activities relate to Domain/Range, whereas static BEINGs don't.


It is ⊗4not⊗* optimal to have only one representation of knowledge in a 
system;
while multiple knowledge formalisms create interfacing difficulties, the advantages
of expressing a piece of information in the best-suited way is considered worth the
headaches of interfacing.   BEINGs are fine for storing structured information in an
accessable manner, but much of the system will be inaccessable (e.g., intuition).

The specific math knowledge has many sophisticated duties,
so it seems appropriate that BEINGs be used to hold and organize
this information. The main cost, that of slowness, is not critical
here, since each individual chunk is used infrequently, and a wrong
usage is far more serious than a slow usage.  One final factor in
favor of using BEINGs here is that all the knowledge that is
available at the time of creation of a new BEING will find its way to the
right place; any missing knowledge will be conspicuous as a blank or
incomplete BEING part. 

The contents of each part of each BEING is composed of specialized rules,
assertions, and pointers to other parts and other BEINGs. The
knowledge may have to be altered from time to time, hence must be
inspectable and interpretable meaningfully and easily, so compiled
code is ruled out. To demand that each part of each BEING be itself a BEING
would trivially cause an infinite regress. Hence the reliance upon
"intermediate" representations and strategy BEINGs.

.END

.SSEC(Projected Results)

There are several possible outcomes of all this. 
Ideally, the system will find a useful
redivision of some concepts, and new concepts and theories
overlooked by mathematicians.
The next best result would be the re-discovery and re-development of some
existing mathematical theories, perhaps by being carefully pushed along the "right"
path.
Even if the system never gets beyond the  most elementary levels in each
field, that very failure will indicate for the first time a lower bound
on the magnitude of the theory formation problem. 

.SSEC(Synthesis)

Before closing, a brief review is in order. We began by examining why
biological entities dominate Earth, progressed to observing cooperating
human experts, and finally abstracted out a manageable set of design
constraints for a viable system. 
These were: discrete modules containing interrelated knowledge on very
specialized topics, each with the same anatomy but with a different area of
expertise, capable of querying, answering, and reproducing.

The first implementation, in the domain
of Automatic Programming, was partially successful. The difficulties
promted us to choose a new domain (Elementary Number Theory Research) and weaken
our constraints (uniform anatomy only within each family).
The role of knowledge modules is central to the new system, but not universal: 
wherever a more efficient representation of knowledge is possible, it 
should be employed.

.ASEC(References)
.GROUP SKIP 1
.BEGIN  FILL  SPACING 0 MILLS  PREFACE 10 MILLS  WIDEN 6,8  INDENT 0,5,0

↑1 C. Hewitt, "A Universal Modular ACTOR Formalism for
Artificial Intelligence," ⊗43rd International Joint Conference
on Artificial Intelligence⊗*, 1973, pp. 235-245.

↑2 P. Winston, ⊗4Learning Structural Descriptions
from Examples⊗*, Ph.D. thesis, Dept. of Electrical Engineering,
TR-76, Project MAC, TR-231, Artificial Intelligence Laboratory,
Massachusetts Institute of Technology, Cambridge, Massachusetts,
September, 1970.

↑3 C. G. Hempel, ⊗4Fundamentals of Concept Formation in
Empirical Science⊗*, University of Chicago, Chicago, Illinois,
1952.

↑4 C. Green, R. Waldinger, D. Barstow, R. Elshlager, D. Lenat, B. McCune, D. Shaw,
L. Steinberg,
"Progress Report on Program-Understanding Systems", ⊗4SAIL Memo AIM-240,
CS Report STAN-CS-74-444,⊗* Artificial Intelligence Laboratory,
Stanford University, August, 1974.

↑5 W. A. Woods and J. Makhoul, "Mechanical Inference
Problems in
Continuous Speech Understanding, ⊗4Third International Joint Conference
on Artificial Intelligence⊗*, 1973, pp. 200-207.

↑6 R. Floyd, "Toward Interactive Design of Correct 
Programs", ⊗4IFIP⊗* 1971, V. 1, pp. 7-10.

↑7 D. Lenat, "Synthesis of Large Programs from Specific Dialogues",
⊗4Proceedings of the Symposium on Proving and Improving Programs⊗*, 
Institut de Recherche d'Informatique et d'Automatique, July, 1975.

↑8 P. Winston, ed., "New Progress in Artificial Intelligence",
⊗4MIT AI Lab Memo AI-TR-310⊗*, June, 1974. 
Note especially the summaries of work on Frames,
Demons, Hacker, Heterarchy, Dialogue, and Belief.

↑9 W. Teitelman, ⊗4INTERLISP Reference Manual⊗*, XEROX PARC, 1974.  Use was also
made of features of CLISP, QA4 (J. Rulifson), and QLISP (R. Reboh and E. 
Sacerdoti).

↑1↑0 R. B. Kershner and L. R. Wilcox, ⊗4The Anatomy of Mathematics⊗*, The Ronald
Press Company, New York, 1950.

↑1↑1 J. Hadamard, ⊗4The Psychology of Invention in the Mathematical
Field⊗*, Dover Publications, New York, 1945.

↑1↑2 R. Hilpinen, "Rules of Acceptance and Inductive Logic", ⊗4Acta
Philosophica Fennica⊗*, Fasc. 22, North-Holland Publishing Company,
Amsterdam, 1968. Also see J. Pietarinen, "Lawlikeness, Analogy, and
Inductive Logic", in Fasc. 26 of the same journal, 1972.
.END


.ASSEC(Acknowledgements)

The ideas and the system described are built upon recent researches. Many
hours of creative discussions were equally important.
In particular, the author  acknowledges the contributions by
C. Green, A. Cohn, R. Waldinger,
E. Sacerdoti, and M. Wilber.
Computer time for the PUP6 research was generously provided by the Artificial
Intelligence Center of SRI.

.EVERY HEADING(,,)
.EVERY FOOTING(,,)
.PORTION CONTENTS
.NOFILL
.NARROW 6,8
.BEGIN CENTER
.GROUP SKIP 3

.SELECT 5
DUPLICATION OF HUMAN ACTIONS
BY AN INTERACTING COMMUNITY OF KNOWLEDGE MODULES

.SELECT 2


Douglas B. Lenat

Artificial Intelligence Laboratory
Stanford University





.END
.PREFACE 10 MILLS
.TURN ON "{∞→"
.NARROW 7,7
.RECEIVE

.CENTER






.SELECT 7
Mr. Douglas B. Lenat
Artificial Intelligence Laboratory
Computer Science Department
Stanford University
Stanford, California 94305, U.S.A.
.SELECT 1





Suggested running head:  ⊗4↓_Interacting Knowledge Modules_↓⊗*







.SELECT 7
{DATE}
.PAGE←0
.SELECT 1